Applied Machine Learning for Senior Leaders

What are we going to learn?

  • Exploring a real data science problem
  • Understanding how ML can and can’t add value
  • Deploying and evaluating various approaches
  • Thinking about what might come next

Today…

  1. Finding and exploring a dataset
  2. Making our first predictions
  3. How did we do?
  4. Applying machine learning
  5. Where do we go from here?

From data science to AI?

Understanding your data

The Kaggle Titanic Dataset

… now we can begin exploring!

Practical 1 - Exploring data

The Unreasonable Effectiveness of Linear Regression

Linear Models

But we’re not restricted to one factor!

What are we doing?

  • \(y_i = \beta_0 + \kappa T_i + \beta_1 X_{1i} + ... +\beta_k X_{ki} + u_i\)
  • Really, we’re just fitting a line
  • But that line can get super, super squiggly
  • Machine learning is just using compute to make the best multidimensional squiggle
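The model above can be sketched in a few lines of scikit-learn. This is an illustrative example on synthetic data (the coefficients and noise level are made up), just to show that fitting the line is one call:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data for y = b0 + b1*x1 + b2*x2 + noise (values are illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))  # two explanatory factors
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Fitting the line: one call, however many dimensions X has
model = LinearRegression().fit(X, y)
print(model.intercept_)  # recovers something close to 1.0
print(model.coef_)       # close to [2.0, -3.0]
```

Adding more columns to `X` is all it takes to move from one factor to many.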

Practical 2 - Linear Models

From Stats to Machine Learning

Testing and Validating

  • Traditional statistics often evaluated predictions “in sample”, which is the easier test
  • Instead, we evaluate on held-out test data
  • So think really hard about:
    • whether your test set is meaningful
    • what baseline performance is
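One way to make “what is baseline performance?” concrete is to score a deliberately dumb model alongside a real one. A minimal sketch, using the iris data as a stand-in and `DummyClassifier` as the baseline:

```python
from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Baseline: always predict the most common class in the training data
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

print(f"baseline accuracy: {baseline.score(X_test, y_test):.2f}")
print(f"model accuracy:    {model.score(X_test, y_test):.2f}")
```

If your model can’t clearly beat the dummy, the test set is telling you something important.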

Practical 3 - Metrics

Metrics that matter

Linear Models Recap

  • Linear models are generally excellent places to start
  • Use them to think about your data: how it’s built, how it’s connected, and what you’re aiming to achieve
  • With a bit of work, these can perform shockingly well
  • But now it’s time to go deeper

Into the forest of ML

The Python ML Ecosystem

The sklearn paradigm

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
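A natural next step in the same paradigm (not on the slide) is scoring those predictions, which closes the fit → predict → evaluate loop:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Same pipeline as the slide, plus an evaluation step
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)
gnb = GaussianNB().fit(X_train, y_train)
y_pred = gnb.predict(X_test)
print(accuracy_score(y_test, y_pred))
```

Every estimator in sklearn follows this same `fit` / `predict` shape, which is what makes swapping models so cheap.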

Practical 4 - Machine Learning

Cutting Edge Machine Learning (it’s always XGBoost)

Gradient, Loss and Boosting - When lines get squiggly

So how do we fit the super squiggly line?

  • Neural network approaches rely on gradient descent to minimise a loss function
  • Gradient boosters rely on combining (or “boosting”) many weak models
  • Combining your forecasts can be hugely effective
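The slides name XGBoost; for a self-contained sketch, sklearn’s built-in `GradientBoostingClassifier` shows the same idea, with the iris data standing in for a real problem:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new tree is fitted to the errors (gradients) of the ensemble so far
booster = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, random_state=0
)
booster.fit(X_train, y_train)
print(booster.score(X_test, y_test))
```

XGBoost exposes an sklearn-compatible interface (`xgboost.XGBClassifier`), so swapping it in is a one-line change.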

Tous Ensemble!

Combining Models

  • Boosting is only one approach
  • Just like Random Forests, the best models are often combinations
  • Let’s implement it!
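A minimal sketch of one combination approach, a majority vote across three different model families via `VotingClassifier` (again on iris as a stand-in dataset):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three different models vote on each prediction
ensemble = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("nb", GaussianNB()),
])
ensemble.fit(X_train, y_train)
print(ensemble.score(X_test, y_test))
```

The ensemble is only useful if its members make different mistakes, which is worth checking before celebrating.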

Risks and Benefits to Ensemble

  • Has your model performed better?
  • What is the implication for fit and variance?

Optional Materials

Class Imbalance

  • How imbalanced is our dataset?
  • What impact do you think this has in reality?
  • There are a range of approaches:
    • Weighted errors
    • Bespoke algorithms (e.g. SMOTE)
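The weighted-errors approach can be sketched with sklearn’s `class_weight` option. The 95/5 split here is synthetic and purely illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic problem where one class is rare (95/5 split, illustrative only)
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# "balanced" up-weights errors on the rare class
weighted = LogisticRegression(class_weight="balanced", max_iter=1000).fit(
    X_train, y_train
)

print(recall_score(y_test, plain.predict(X_test)))
print(recall_score(y_test, weighted.predict(X_test)))
```

Weighting typically trades some precision for better recall on the rare class; whether that trade is worth it depends on the cost of each kind of mistake.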

Where do we go from here?

Text and NLP

  • We haven’t used the Name column at all
  • How would you extract value from it?
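One simple answer: the Titanic Name field contains a title, which is a strong proxy for age, sex, and class. A sketch with pandas (the rows below are sample names from the dataset, not loaded from the real `train.csv`):

```python
import pandas as pd

# Sample Titanic-style names; in practice this column comes from Kaggle's train.csv
names = pd.Series([
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
])

# Pull out the title: the text between the comma and the first full stop
titles = names.str.extract(r",\s*([^.]+)\.", expand=False)
print(titles.tolist())  # ['Mr', 'Mrs', 'Miss']
```

Once extracted, the title becomes an ordinary categorical feature you can feed to any of the models above.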

Image Recognition

  • Thinking beyond tabular data, how could what you’ve learnt be applied to images?

Evidence House and the AI Digest

Thanks!

  • avarotsis@no10.gov.uk
  • Andreas Varotsis @ Kaggle